Abstract: Recently there has been a lot of success in using the
deterministic approach to provide approximate characterization
of Gaussian network capacity. In this paper, we take a deterministic
view and revisit the problem of wiretap channel with side
information. A precise characterization of the secrecy capacity
is obtained for a linear deterministic model, which naturally
suggests a coding scheme that we show to achieve the secrecy
capacity of the degraded Gaussian model (dubbed "secret
writing on dirty paper") to within half a bit.

Index Terms: Dirty-paper coding, information-theoretic security,
linear deterministic model, side information, wiretap channel



I. Introduction 

In information theory, an interesting and useful communi- 
cation model is a state-dependent channel where the channel 
states are non-causally known at the transmitter as side in- 
formation. Of particular importance is a discrete-time channel 
with real input and additive white Gaussian noise and inter- 
ference, where the interference is non-causally known at the 
transmitter as side information. 

Costa [1] was the first to study this communication scenario,
which he whimsically coined "writing on dirty paper."
Based on an earlier result of Gel'fand and Pinsker [2], Costa
[1] proved the surprising result that the capacity of writing
on dirty paper is the same as that of writing on clean
paper without interference. Since [1], dirty-paper coding has
found a wide range of applications in digital watermarking
and network communications, particularly involving broadcast
scenarios.

Recent works [3] and [4] studied the problem of dirty-paper
coding in the presence of an additional eavesdropper, which
is a natural extension of Costa's dirty-paper channel to the
secrecy communication setting. In this scenario, which we dub
"secret writing on dirty paper", the legitimate receiver channel
is a dirty-paper channel of Costa. The signal received at the
eavesdropper, on the other hand, is assumed to be a degraded
version of the signal received at the legitimate receiver. An
achievable secrecy rate was established based on a double-binning
scheme and was shown to be the secrecy capacity of
the channel under some channel parameter configurations [3],
[4]. For the general channel parameter configuration, however,
the secrecy capacity of the channel remains unknown.

In facing some challenging Gaussian network communication
problems, recent advances [5], [6] in network information
theory advocate a deterministic approach and seek an
approximate characterization of the network capacity to within
a finite number of bits (regardless of the received signal-to-noise ratios).
Motivated by the success of [5] and [6], in this paper we take
a deterministic view and revisit the problem of wiretap channel
with side information. A precise characterization of the secrecy
capacity is obtained for a linear deterministic model, which
naturally suggests a coding scheme that we show to achieve
the secrecy capacity of the degraded Gaussian model to within
half a bit.

The rest of the paper is organized as follows. In Sec. II we
first take a deterministic view of Costa's dirty-paper channel
and provide an approximate characterization of the channel
capacity to within half a bit. Note that even though a precise
characterization of Costa's dirty-paper channel is well known
[1], the proposed approximate characterization establishes a
framework for studying side-information problems via the
deterministic approach. Building on the framework of Sec. II,
in Sec. III we extend the deterministic approach to the problem
of secret writing on dirty paper and provide an approximate
characterization of the secrecy capacity to within half a bit.
A different but closely related communication scenario known
as secret-key agreement via dirty-paper coding is discussed in
Sec. IV. Finally, in Sec. V we conclude the paper with some
remarks.

II. Writing on Dirty Paper 

A. Gaussian Model 

Consider the dirty-paper channel of Costa [1], where the
received signal $Y[i]$ at time index $i$ is given by

$$Y[i] = hX[i] + gS[i] + N[i]. \tag{1}$$

Here, X[i] is the channel input which is subject to a unit 
average power constraint, N[i] and S[i] are independent stan- 
dard Gaussian noise and interference and are independently 
identically distributed (i.i.d.) across the time index i, and h 
and g are the (real) channel coefficients corresponding to the 
channel input and interference, respectively. The interference 
S[i] is assumed to be non-causally known at the transmitter as 
side information. The channel coefficients h and g are fixed 
during communication and are assumed to be known at both 
transmitter and receiver. 

The channel capacity, as shown by Costa [1], is given by

$$C = I(U;Y) - I(U;S)$$

where the input variable $X$ is standard Gaussian and independent
of the known interference $S$, and $U$ is an auxiliary
variable chosen as

$$U = hX + \frac{h^2}{1+h^2}\, gS. \tag{2}$$

For this choice of auxiliary-input variable pair $(U, X)$,

$$I(U;Y) - I(U;S) = \frac{1}{2}\log(1+h^2)$$

which equals the capacity of the channel (1) when the interference
$S[i]$ is also known at the receiver.

B. Linear Deterministic Model 

Consider the linear deterministic model [6] for Costa's dirty-paper
channel (1), where the received signal $Y[i]$ at time index
$i$ is given by

$$Y[i] = D^{q-n} X[i] \oplus D^{q-m} S[i]. \tag{3}$$

Here, $X[i]$ is the binary input vector of length $q = \max\{n, m\}$,
$S[i]$ is the i.i.d. interference vector whose elements
are i.i.d. Bernoulli-1/2, $D = [d_{j,k}]$ is the $q \times q$ down-shift
matrix with elements

$$d_{j,k} = \begin{cases} 1, & \text{if } 2 \le j = k+1 \le q \\ 0, & \text{otherwise} \end{cases}$$

and $n$ and $m$ are the integer channel gains corresponding to
the channel input and interference, respectively. The vector
interference $S[i]$ is assumed to be non-causally known at the
transmitter as side information. The channel gains $n$ and $m$
are fixed during communication and are assumed to be known
at both transmitter and receiver.

Following the result of Gel'fand and Pinsker [2], the capacity
of the linear deterministic dirty-paper channel (3) is given
by

$$C = I(U;Y) - I(U;S)$$

where the input variable $X$ is an i.i.d. Bernoulli-1/2 random
vector and independent of $S$, and $U$ is an auxiliary variable
chosen as

$$U = Y = D^{q-n} X \oplus D^{q-m} S. \tag{4}$$

For this choice of the auxiliary-input variable pair $(U, X)$,

$$\begin{aligned}
I(U;Y) - I(U;S) &= I(Y;Y) - I(Y;S) \\
&= H(Y) - I(Y;S) \\
&= H(Y|S) \\
&= H(D^{q-n} X) \\
&= \mathrm{rank}(D^{q-n}) \\
&= n
\end{aligned}$$

which equals the capacity of the channel (3) when the interference
$S[i]$ is also known at the receiver.
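
As an illustrative sanity check (not part of the original derivation), the following Python sketch builds the power $D^{q-n}$ of the down-shift matrix, computes its rank over GF(2), and counts the distinct outputs of (3) for one fixed interference vector. The helper names `down_shift_power`, `gf2_rank`, and `mat_vec`, as well as the particular values of $q$, $n$, $m$, and $s$, are assumptions made for this example only.

```python
from itertools import product


def down_shift_power(q, k):
    """Return D^k for the q x q down-shift matrix D as a list of 0/1 rows."""
    # D^k has a one exactly where (row index) = (column index) + k.
    return [[1 if j == c + k else 0 for c in range(q)] for j in range(q)]


def gf2_rank(matrix):
    """Rank over GF(2), using Gaussian elimination on bit-packed rows."""
    rows = [sum(bit << c for c, bit in enumerate(row)) for row in matrix]
    rank = 0
    while rows:
        pivot = rows.pop()
        if pivot:
            rank += 1
            lowest = pivot & -pivot  # lowest set bit of the pivot row
            rows = [r ^ pivot if r & lowest else r for r in rows]
    return rank


def mat_vec(matrix, v):
    """Matrix-vector product over GF(2)."""
    return tuple(sum(a & b for a, b in zip(row, v)) % 2 for row in matrix)


# Hypothetical example parameters: q = max{n, m}.
q, n, m = 5, 3, 5
Dx = down_shift_power(q, q - n)
Ds = down_shift_power(q, q - m)

# Fix one interference realization and sweep over all inputs X.
s = (1, 0, 1, 1, 0)
interference = mat_vec(Ds, s)
outputs = {
    tuple(a ^ b for a, b in zip(mat_vec(Dx, x), interference))
    for x in product((0, 1), repeat=q)
}

print("rank(D^(q-n)) =", gf2_rank(Dx))
print("distinct outputs for fixed S =", len(outputs))
```

With $X$ uniform, the number of distinct outputs for a fixed $S$ is $2^{H(Y|S)}$, so the two printed quantities should agree with $n$ and $2^n$, respectively.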

We emphasize that in (4) we may choose $U = Y$ only
because $Y$ here is a deterministic function of $X$ and $S$.
In fact, for any deterministic Gel'fand-Pinsker channel (not
necessarily linear) where the channel output $Y$ is a deterministic
(bivariate) function of the channel input $X$ and state $S$,
$\max_{p(x|s)} H(Y|S)$ is the capacity of the channel when the
channel state $S$ is also known at the receiver. Thus, $U = Y$ is
always an optimal choice for deterministic Gel'fand-Pinsker
channels, a fact which was also observed in [7] recently.



C. Connections between the Gaussian and the Linear Deter- 
ministic Model 

A quick comparison between the Gaussian (1) and the linear
deterministic (3) models reveals the following equivalence
relationship between these two models:

$$D^{q-n} \leftrightarrow h \quad \text{and} \quad D^{q-m} \leftrightarrow g. \tag{5}$$

Given this equivalence relationship, the optimal choice (4) of
auxiliary variable $U$ for the linear deterministic model (3)
naturally suggests the following choice of auxiliary variable
$U$ for the Gaussian model (1):

$$U = hX + gS \tag{6}$$




where $X$ is standard Gaussian and independent of $S$. Compared
with the optimal choice (2), the choice (6) of auxiliary
variable $U$ is suboptimal. However, for this suboptimal choice
of auxiliary-input variable pair $(U, X)$,

$$I(U;S) = \frac{1}{2}\log\left(1 + \frac{g^2}{h^2}\right)
\quad \text{and} \quad
I(U;Y) = \frac{1}{2}\log(1 + h^2 + g^2)$$

giving an achievable rate

$$R = [I(U;Y) - I(U;S)]^+ = \left[\frac{1}{2}\log\frac{(1+h^2+g^2)h^2}{h^2+g^2}\right]^+ \ge \left[\frac{1}{2}\log(h^2)\right]^+$$

which is always within half a bit of the actual channel capacity
$C = \frac{1}{2}\log(1+h^2)$. Here, we denote $x^+ := \max\{0, x\}$ so that
the achievable rates are always nonnegative.
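
The comparison between the choices (2) and (6) can be checked numerically. The sketch below is an illustration under the Gaussian model (1) with base-2 logarithms; the function names `gaussian_mi` and `dirty_paper_rate` and the parameter grid are assumptions introduced here, not notation from the paper. It evaluates $[I(U;Y) - I(U;S)]^+$ for $U = hX + \alpha gS$ directly from second moments, with $\alpha = h^2/(1+h^2)$ corresponding to (2) and $\alpha = 1$ corresponding to (6).

```python
import math


def gaussian_mi(var_a, var_b, cov_ab):
    """Mutual information in bits between two jointly Gaussian scalars."""
    rho_sq = cov_ab ** 2 / (var_a * var_b)
    return -0.5 * math.log2(1.0 - rho_sq)


def dirty_paper_rate(h, g, alpha):
    """[I(U;Y) - I(U;S)]^+ for U = h X + alpha g S and Y = h X + g S + N."""
    var_u = h ** 2 + (alpha * g) ** 2
    var_y = h ** 2 + g ** 2 + 1.0
    cov_uy = h ** 2 + alpha * g ** 2
    cov_us = alpha * g  # S has unit variance
    rate = gaussian_mi(var_u, var_y, cov_uy) - gaussian_mi(var_u, 1.0, cov_us)
    return max(0.0, rate)


def capacity(h):
    """Costa's capacity (1/2) log2(1 + h^2)."""
    return 0.5 * math.log2(1.0 + h ** 2)


# Hypothetical grid of channel coefficients (h must be nonzero here).
grid = [(h, g) for h in (0.1, 0.5, 1.0, 2.0, 10.0) for g in (0.0, 0.5, 1.0, 3.0, 10.0)]

for h, g in grid:
    optimal = dirty_paper_rate(h, g, h ** 2 / (1.0 + h ** 2))
    suggested = dirty_paper_rate(h, g, 1.0)
    print(f"h={h:5.1f} g={g:5.1f} C={capacity(h):.4f} "
          f"alpha_opt={optimal:.4f} alpha_1={suggested:.4f}")
```

On this grid the first rate should reproduce the capacity, while the second should stay within half a bit of it, in line with the bound above.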

The fact that the choice (6) of auxiliary variable $U$ leads to
an achievable rate which is always within half a bit of the dirty-paper
channel capacity is well known (see [8] for example).
However, it is interesting to see that such a choice comes up
naturally in the context of the deterministic approach.



III. Secret Writing on Dirty Paper 

Having understood how the linear deterministic model of [6]
may be used to obtain an approximate characterization of the 
capacity of Costa's dirty-paper channel, next we shall extend 
the deterministic approach to the problem of secret writing on 
dirty paper. 

Fig. 1. Wiretap channel with side information.

A. Discrete Memoryless Model 

As illustrated in Fig. 1, consider a discrete-time memoryless
wiretap channel with transition probability $p(y_1, y_2 | x, s)$,
where $X[i]$ is the channel input (at time index $i$), $S[i]$ is
the channel state, and $Y_1[i]$ and $Y_2[i]$ are the received signals
at the legitimate receiver and the eavesdropper, respectively.
The channel state $S[i]$ is i.i.d. across the time index $i$ and
is assumed to be non-causally known at the transmitter as
side information. The transmitter has a message $W$, which
is intended for the legitimate receiver but needs to be kept
asymptotically perfectly secret from the eavesdropper. Following
the classical works [9] and [10], it is required that

$$\frac{1}{n} I(W; Y_2^n) \to 0 \tag{7}$$

in the limit as the block length $n \to \infty$, where $Y_2^n :=
(Y_2[1], \ldots, Y_2[n])$. The secrecy capacity $C_s$ is defined as the
largest secrecy rate that can be achieved by a coding scheme.

Chen and Vinck [4] derived a single-letter lower bound on
the secrecy capacity (an achievable secrecy rate), which can
be written as

$$C_s \ge \max_{p(u,x|s)} \min\{ I(U;Y_1) - I(U;S),\; I(U;Y_1) - I(U;Y_2) \} \tag{8}$$

where $U$ is an auxiliary variable satisfying the Markov chain
$U \to (X, S) \to (Y_1, Y_2)$.

We also have the following simple upper bound on the 
secrecy capacity. 

Proposition 1: The secrecy capacity $C_s$ of a discrete memoryless
wiretap channel $p(y_1, y_2 | x, s)$ with channel state $S$
non-causally known at the transmitter as side information can
be bounded from above as

$$C_s \le \max_{p(x|s)} \min\{ I(X;Y_1|S),\; I(X,S;Y_1|Y_2) \}. \tag{9}$$


Note that $\max_{p(x|s)} I(X;Y_1|S)$ is an upper bound on
the Shannon capacity of the legitimate receiver channel, obtained by
giving the channel state $S$ to the legitimate receiver, and
$\max_{p(x|s)} I(X,S;Y_1|Y_2)$ is an upper bound on the secrecy
capacity of the wiretap channel, obtained by allowing the transmit
message $W$ to be encoded by the channel state $S$ (i.e., fully
action-dependent state [11]) and by giving the received signal
$Y_2$ to the legitimate receiver. Here, a simple single-letterization
technique of Willems [12] allows the maximization to be
moved outside the minimization. See Appendix A for the
details of the proof.



For semi-deterministic channels where the channel output at 
the legitimate receiver is a deterministic (bivariate) function of 
the channel input and state, the lower (8) and the upper (9)
bounds coincide, leading to a precise characterization of the 
secrecy capacity. The result is summarized in the following 
theorem. 

Theorem 1: Consider a discrete memoryless wiretap channel
$p(y_1, y_2 | x, s)$ with channel state $S$ non-causally known at
the transmitter as side information. If the received signal $Y_1$
at the legitimate receiver is a deterministic function of the
channel input $X$ and state $S$, i.e., $Y_1 = f(X, S)$ for some
bivariate function $f$, the secrecy capacity $C_s$ of the channel is
given by

$$C_s = \max_{p(x|s)} \min\{ H(Y_1|S),\; H(Y_1|Y_2) \}. \tag{10}$$


Proof: The fact that

$$C_s \ge \max_{p(x|s)} \min\{ H(Y_1|S),\; H(Y_1|Y_2) \}$$

follows from the lower bound (8) by setting $U = Y_1$ (we may
do so only because here $Y_1$ is a deterministic function of $X$
and $S$), which gives

$$I(U;Y_1) - I(U;S) = H(Y_1) - I(Y_1;S) = H(Y_1|S)$$

and similarly

$$I(U;Y_1) - I(U;Y_2) = H(Y_1) - I(Y_1;Y_2) = H(Y_1|Y_2).$$

The converse part of the theorem follows from the upper
bound (9) and the fact that $Y_1$ is a deterministic function of
$(X, S)$, so we have

$$I(X;Y_1|S) = H(Y_1|S) - H(Y_1|X,S) = H(Y_1|S)$$

and

$$I(X,S;Y_1|Y_2) = H(Y_1|Y_2) - H(Y_1|X,S,Y_2) = H(Y_1|Y_2).$$

This completes the proof of the theorem. ■
Note that when the channel state $S$ is deterministic, a semi-deterministic
wiretap channel with side information reduces
to a regular semi-deterministic wiretap channel without side
information. In this case, let $S$ be a constant in (10), and we
have

$$C_s = \max_{p(x)} \min\{ H(Y_1),\; H(Y_1|Y_2) \} = \max_{p(x)} H(Y_1|Y_2)$$

which recovers the result of [13] on the secrecy capacity of
the semi-deterministic wiretap channel (without side information).

B. Linear Deterministic Model 

Next, let us use the result of Theorem 1 to determine the
secrecy capacity of a linear deterministic wiretap channel with
side information. In this model, the received signals (at time
index $i$) at the legitimate receiver and the eavesdropper are
given by

$$\begin{aligned}
Y_1[i] &= D^{q-n_1} X[i] \oplus D^{q-m_1} S[i] \\
Y_2[i] &= D^{q-n_2} X[i] \oplus D^{q-m_2} S[i]
\end{aligned} \tag{11}$$

where $X[i]$ is the binary input vector of length $q =
\max\{n_1, n_2, m_1, m_2\}$, $S[i]$ is the i.i.d. vector interference
whose elements are i.i.d. Bernoulli-1/2, $D$ is the $q \times q$
down-shift matrix, and $n_1$, $n_2$, $m_1$ and $m_2$ are the integer
channel gains. The vector interference $S[i]$ is assumed to be
non-causally known at the transmitter as side information.
The channel gains $n_1$, $n_2$, $m_1$ and $m_2$ are fixed during
communication and are assumed to be known at all terminals.

The following theorem provides an explicit characterization
of the secrecy capacity of the linear deterministic wiretap
channel (11) with side information.

Theorem 2: The secrecy capacity $C_s$ of the linear deterministic
wiretap channel (11) with side information is given
by

$$C_s = \begin{cases}
n_1, & \text{if } n_1 - m_1 \ne n_2 - m_2 \text{ and } (n_1 \le m_1 \text{ or } n_2 \le m_2) \\
\max\{m_1,\; n_1 - n_2 + m_2\}, & \text{if } n_1 - m_1 \ne n_2 - m_2,\; n_1 > m_1 \text{ and } n_2 > m_2 \\
(n_1 - n_2)^+, & \text{if } n_1 - m_1 = n_2 - m_2.
\end{cases} \tag{12}$$




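To prove Theorem 2, let us first prove the following proposition.

Proposition 2: The secrecy capacity $C_s$ of the linear deterministic
wiretap channel (11) with side information is given
by

$$C_s = \min\left\{ n_1,\; \mathrm{rank}\begin{bmatrix} A \\ B \end{bmatrix} - \mathrm{rank}(B) \right\} \tag{13}$$

where

$$A := \begin{bmatrix} D^{q-n_1} & D^{q-m_1} \end{bmatrix}
\quad \text{and} \quad
B := \begin{bmatrix} D^{q-n_2} & D^{q-m_2} \end{bmatrix}. \tag{14}$$
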



Proof: To prove (13), we shall show that for the linear
deterministic model (11), both $H(Y_1|S)$ and $H(Y_1|Y_2)$ are
simultaneously maximized when $X$ is an i.i.d. Bernoulli-1/2
random vector and independent of $S$.
First,

$$\begin{aligned}
H(Y_1|S) &= H(D^{q-n_1} X | S) \\
&\le H(D^{q-n_1} X) \\
&\le \mathrm{rank}(D^{q-n_1}) \\
&= n_1
\end{aligned} \tag{15}$$

where the equalities hold when $X$ is an i.i.d. Bernoulli-1/2
random vector and independent of $S$.

To show that $H(Y_1|Y_2)$ is also maximized when $X$ is an
i.i.d. Bernoulli-1/2 random vector and independent of $S$, we
shall need the following technical lemma, which can be proved
using a counting argument as provided in Appendix B.

Lemma 1: For any matrices $A$ and $B$ over $\mathbb{F}_2$ (the Galois field of
size 2) that have the same number of columns,

$$\max H(AZ|BZ) = \mathrm{rank}\begin{bmatrix} A \\ B \end{bmatrix} - \mathrm{rank}(B) \tag{16}$$

where the maximization is over all possible binary random
vectors $Z$. The maximum is achieved when $Z$ is an i.i.d.
Bernoulli-1/2 random vector.
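
For small matrices, the value attained in (16) by an i.i.d. Bernoulli-1/2 vector $Z$ can be confirmed by exhaustive enumeration. The following sketch is only an illustration: the matrices `A` and `B` are arbitrary examples chosen here, and the helper names are assumptions. It does not search over all input distributions, so it checks the achievability side of the lemma only.

```python
from collections import Counter
from itertools import product
from math import log2


def gf2_rank(matrix):
    """Rank over GF(2), using Gaussian elimination on bit-packed rows."""
    rows = [sum(bit << c for c, bit in enumerate(row)) for row in matrix]
    rank = 0
    while rows:
        pivot = rows.pop()
        if pivot:
            rank += 1
            lowest = pivot & -pivot  # lowest set bit of the pivot row
            rows = [r ^ pivot if r & lowest else r for r in rows]
    return rank


def mat_vec(matrix, v):
    """Matrix-vector product over GF(2)."""
    return tuple(sum(a & b for a, b in zip(row, v)) % 2 for row in matrix)


def entropy(counts, total):
    """Entropy in bits of the empirical distribution given by counts."""
    return -sum(c / total * log2(c / total) for c in counts.values())


def cond_entropy_uniform(A, B):
    """H(AZ | BZ) in bits when Z is uniform over all binary vectors."""
    k = len(A[0])
    joint, marginal = Counter(), Counter()
    for z in product((0, 1), repeat=k):
        az, bz = mat_vec(A, z), mat_vec(B, z)
        joint[(az, bz)] += 1
        marginal[bz] += 1
    total = 2 ** k
    return entropy(joint, total) - entropy(marginal, total)


# Arbitrary example matrices with four columns each.
A = [[1, 0, 1, 0], [0, 1, 1, 0]]
B = [[1, 1, 0, 0], [0, 0, 0, 1]]

lhs = cond_entropy_uniform(A, B)
rhs = gf2_rank(A + B) - gf2_rank(B)  # A + B stacks the rows of A on top of B
print("H(AZ|BZ) for uniform Z:", lhs)
print("rank([A; B]) - rank(B):", rhs)
```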



Now let

$$Z := \begin{bmatrix} X \\ S \end{bmatrix}.$$

By Lemma 1,

$$H(Y_1|Y_2) = H(AZ|BZ) \le \mathrm{rank}\begin{bmatrix} A \\ B \end{bmatrix} - \mathrm{rank}(B) \tag{17}$$

where the equality holds when $X$ is an i.i.d. Bernoulli-1/2
random vector and independent of $S$.

Substituting (15) and (17) into (10) completes the proof of
the proposition. ■

Given Proposition 2, the explicit characterization (12) of the
secrecy capacity $C_s$ can be obtained from (13) by evaluating
the rank of the matrices $\begin{bmatrix} A \\ B \end{bmatrix}$
and $B$. The details of the evaluation process are provided in
Appendix C.
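
The rank evaluation can also be carried out numerically for small channel gains. The sketch below is a hedged illustration rather than the paper's procedure: it computes the rank-based expression of Proposition 2 over GF(2) and compares it with a three-case closed form in which the first case applies when $n_1 \le m_1$ or $n_2 \le m_2$, the second when $n_1 > m_1$ and $n_2 > m_2$, and the third when $n_1 - m_1 = n_2 - m_2$. The function names and the range of gains swept are assumptions for this example.

```python
from itertools import product


def gf2_rank(rows):
    """Rank over GF(2) of a matrix whose rows are bit-packed integers."""
    rows = list(rows)
    rank = 0
    while rows:
        pivot = rows.pop()
        if pivot:
            rank += 1
            lowest = pivot & -pivot  # lowest set bit of the pivot row
            rows = [r ^ pivot if r & lowest else r for r in rows]
    return rank


def channel_rows(q, n, m):
    """Rows of [D^(q-n)  D^(q-m)], packed as integers over 2q columns."""
    rows = []
    for j in range(q):
        row = 0
        if j >= q - n:  # input block: one at column j - (q - n)
            row |= 1 << (j - (q - n))
        if j >= q - m:  # interference block: one at column q + j - (q - m)
            row |= 1 << (q + j - (q - m))
        rows.append(row)
    return rows


def secrecy_capacity_rank(n1, m1, n2, m2):
    """min{n1, rank([A; B]) - rank(B)} with A and B built from the gains."""
    q = max(n1, m1, n2, m2)
    A = channel_rows(q, n1, m1)
    B = channel_rows(q, n2, m2)
    return min(n1, gf2_rank(A + B) - gf2_rank(B))


def secrecy_capacity_closed_form(n1, m1, n2, m2):
    """Three-case closed form, as assumed in the lead-in above."""
    if n1 - m1 == n2 - m2:
        return max(n1 - n2, 0)
    if n1 > m1 and n2 > m2:
        return max(m1, n1 - n2 + m2)
    return n1


# Sweep a small (hypothetical) range of integer channel gains.
mismatches = [
    gains for gains in product(range(5), repeat=4)
    if secrecy_capacity_rank(*gains) != secrecy_capacity_closed_form(*gains)
]
print("mismatching gain tuples:", mismatches)
```

An empty list of mismatches indicates that the two expressions agree on every gain tuple in the sweep; it is a consistency check, not a substitute for the rank evaluation in Appendix C.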



C. Degraded Gaussian Model 

Finally, let us consider the Gaussian wiretap channel where
the received signals (at time index $i$) at the legitimate receiver
and the eavesdropper are given by

$$\begin{aligned}
Y_1[i] &= h_1 X[i] + g_1 S[i] + N_1[i] \\
Y_2[i] &= h_2 X[i] + g_2 S[i] + N_2[i].
\end{aligned} \tag{18}$$

Here, $X[i]$ is the channel input which is subject to a unit
average power constraint, $N_k[i]$, $k = 1, 2$ and $S[i]$ are
independent standard Gaussian noise and interference and are
i.i.d. across the time index $i$, and $h_1$, $h_2$, $g_1$ and $g_2$ are the
(real) channel coefficients. The interference $S[i]$ is assumed to
be non-causally known at the transmitter as side information.
The channel coefficients $h_1$, $h_2$, $g_1$ and $g_2$ are fixed during
communication and are assumed to be known at all terminals.

A single-letter expression for an achievable secrecy rate was
given in (8), which involves an auxiliary variable $U$. However,
it is not clear what would be a reasonable choice of $U$, let
alone an optimal one that maximizes the achievable
secrecy rate expression (8). On the other hand, for the linear
deterministic model (11), it is clear from Theorem 1 and
Proposition 2 that the following choice of auxiliary variable
$U$ is optimal:

$$U = Y_1 = D^{q-n_1} X \oplus D^{q-m_1} S \tag{19}$$

where $X$ is an i.i.d. Bernoulli-1/2 random vector and independent
of $S$.

Based on the equivalence relationship (5) between the
Gaussian and the linear deterministic model and the success
of Sec. II for Costa's dirty-paper channel, the optimal choice
(19) of auxiliary variable $U$ for the linear deterministic model
(11) suggests the following choice of auxiliary variable $U$ for
the Gaussian model (18):

$$U = h_1 X + g_1 S \tag{20}$$




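where $X$ is standard Gaussian and independent of $S$. For this
choice of auxiliary-input variable pair $(U, X)$,

$$I(U;S) = \frac{1}{2}\log\left(1 + \frac{g_1^2}{h_1^2}\right) \tag{21}$$

$$I(U;Y_1) = \frac{1}{2}\log(1 + h_1^2 + g_1^2) \tag{22}$$

$$\text{and} \quad I(U;Y_2) = \frac{1}{2}\log\frac{(h_1^2+g_1^2)(1+h_2^2+g_2^2)}{h_1^2+g_1^2+(h_1 g_2 - h_2 g_1)^2} \tag{23}$$

giving

$$I(U;Y_1) - I(U;S) = \frac{1}{2}\log\frac{(1+h_1^2+g_1^2)h_1^2}{h_1^2+g_1^2}$$

and

$$I(U;Y_1) - I(U;Y_2) = \frac{1}{2}\log\frac{(1+h_1^2+g_1^2)\left[h_1^2+g_1^2+(h_1 g_2 - h_2 g_1)^2\right]}{(h_1^2+g_1^2)(1+h_2^2+g_2^2)}.$$

By the single-letter achievable secrecy rate expression (8),

$$R_s = \left( \min\left\{ \frac{1}{2}\log\frac{(1+h_1^2+g_1^2)h_1^2}{h_1^2+g_1^2},\; \frac{1}{2}\log\frac{(1+h_1^2+g_1^2)\left[h_1^2+g_1^2+(h_1 g_2 - h_2 g_1)^2\right]}{(h_1^2+g_1^2)(1+h_2^2+g_2^2)} \right\} \right)^+ \tag{24}$$

is an achievable secrecy rate for the Gaussian wiretap channel
(18) with side information.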
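
The closed-form expressions (21)-(23) can be cross-checked against a direct second-moment computation. The sketch below is an illustration under the Gaussian model (18) with base-2 logarithms; `gaussian_mi`, `mutual_informations`, `secrecy_rate`, and the numerical coefficients are names and values assumed for this example.

```python
import math


def gaussian_mi(var_a, var_b, cov_ab):
    """Mutual information in bits between two jointly Gaussian scalars."""
    rho_sq = cov_ab ** 2 / (var_a * var_b)
    return -0.5 * math.log2(1.0 - rho_sq)


def mutual_informations(h1, g1, h2, g2):
    """Return (I(U;S), I(U;Y1), I(U;Y2)) for U = h1 X + g1 S (h1 nonzero)."""
    var_u = h1 ** 2 + g1 ** 2
    i_us = gaussian_mi(var_u, 1.0, g1)
    i_uy1 = gaussian_mi(var_u, var_u + 1.0, var_u)
    i_uy2 = gaussian_mi(var_u, h2 ** 2 + g2 ** 2 + 1.0, h1 * h2 + g1 * g2)
    return i_us, i_uy1, i_uy2


def secrecy_rate(h1, g1, h2, g2):
    """Achievable secrecy rate of the form (24), in bits per channel use."""
    i_us, i_uy1, i_uy2 = mutual_informations(h1, g1, h2, g2)
    return max(0.0, min(i_uy1 - i_us, i_uy1 - i_uy2))


# Hypothetical channel coefficients for a spot check.
h1, g1, h2, g2 = 2.0, 1.0, 1.0, -0.5
i_us, i_uy1, i_uy2 = mutual_informations(h1, g1, h2, g2)

p = h1 ** 2 + g1 ** 2
closed_us = 0.5 * math.log2(1.0 + g1 ** 2 / h1 ** 2)
closed_uy1 = 0.5 * math.log2(1.0 + p)
closed_uy2 = 0.5 * math.log2(
    p * (1.0 + h2 ** 2 + g2 ** 2) / (p + (h1 * g2 - h2 * g1) ** 2)
)

print("I(U;S): ", i_us, closed_us)
print("I(U;Y1):", i_uy1, closed_uy1)
print("I(U;Y2):", i_uy2, closed_uy2)
print("R_s:    ", secrecy_rate(h1, g1, h2, g2))
```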

Following the works [3] and [4], below we focus on the
special case where

$$h_2 = \beta h_1 \quad \text{and} \quad g_2 = \beta g_1 \tag{25}$$

for some $|\beta| < 1$. Note that the secrecy capacity of the channel
(18) does not depend on the correlation between the additive
Gaussian noise $N_1[i]$ and $N_2[i]$, so we may write

$$N_2[i] = \beta N_1[i] + N[i]$$

where $N[i]$ is Gaussian with zero mean and variance $1 - \beta^2$
and independent of $N_1[i]$. Thus, for the special case of (25),
the channel (18) can be equivalently written as

$$\begin{aligned}
Y_1[i] &= h_1 X[i] + g_1 S[i] + N_1[i] \\
Y_2[i] &= \beta Y_1[i] + N[i]
\end{aligned} \tag{26}$$

i.e., the received signal $Y_2[i]$ at the eavesdropper is degraded
with respect to the received signal $Y_1[i]$ at the legitimate
receiver.

Following [1], an interesting interpretation of the degraded
Gaussian model (26) is "secret writing on dirty paper." In this
scenario, a user intends to convey (to a legitimate receiver)
a confidential message on a piece of paper with preexisting
dirt on it. The legitimate receiver has access to the original
paper with the message written on it and hence can decode
the intended message. On the other hand, the eavesdropper can
only access a noisy copy of the original paper, from which
essentially no information on the conveyed message can be
inferred.

Next, we show that for the degraded Gaussian model (26),
the achievable secrecy rate (24) is always within half a bit of
the secrecy capacity. The result is summarized in the following
theorem.



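Theorem 3: For the degraded Gaussian wiretap channel (26)
with side information, the secrecy capacity $C_s$ can be
bounded as

$$\left( \min\left\{ \frac{1}{2}\log\frac{(1+h_1^2+g_1^2)h_1^2}{h_1^2+g_1^2},\; \frac{1}{2}\log\frac{1+h_1^2+g_1^2}{1+\beta^2(h_1^2+g_1^2)} \right\} \right)^+ \le C_s \le \min\left\{ \frac{1}{2}\log(1+h_1^2),\; \frac{1}{2}\log\frac{2(h_1^2+g_1^2)+1}{2\beta^2(h_1^2+g_1^2)+1} \right\}. \tag{27}$$

Moreover, the lower bound here is always within half a bit of
the upper bound.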

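Proof: The lower bound in (27) follows from (24) and
the degradedness assumption (25). To prove the upper bound,
note that for any input variable $X$ such that $E[X^2] \le 1$ we
have

$$\begin{aligned}
I(X;Y_1|S) &= h(Y_1|S) - h(Y_1|X,S) \\
&= h(h_1 X + N_1 | S) - h(N_1) \\
&\le h(h_1 X + N_1) - h(N_1) \\
&\le \frac{1}{2}\log(1 + h_1^2 \mathrm{Var}(X)) \\
&\le \frac{1}{2}\log(1 + h_1^2).
\end{aligned} \tag{28}$$

Furthermore,

$$\begin{aligned}
I(X,S;Y_1|Y_2) &= h(Y_1|Y_2) - h(Y_1|X,S,Y_2) \\
&= h(Y_1|\beta Y_1 + N) - h(N_1|\beta N_1 + N) \\
&= h(Y_1|\beta Y_1 + N) - \frac{1}{2}\log\left(2\pi e(1-\beta^2)\right).
\end{aligned} \tag{29}$$

By an inequality of Thomas [14, Lemma 1] and the independence
between $Y_1$ and $N$,

$$h(Y_1|\beta Y_1 + N) \le \frac{1}{2}\log\frac{2\pi e\, \mathrm{Var}(Y_1)(1-\beta^2)}{\beta^2 \mathrm{Var}(Y_1) + (1-\beta^2)}. \tag{30}$$

Note that the right-hand side of (30) is a monotone increasing
function of $\mathrm{Var}(Y_1)$, which can be bounded from above as

$$\begin{aligned}
\mathrm{Var}(Y_1) &= \mathrm{Var}(h_1 X + g_1 S + N_1) \\
&= \mathrm{Var}(h_1 X + g_1 S) + 1 \\
&\le 2\left(\mathrm{Var}(h_1 X) + \mathrm{Var}(g_1 S)\right) + 1 \\
&\le 2h_1^2 + 2g_1^2 + 1.
\end{aligned}$$

Hence,

$$\begin{aligned}
h(Y_1|\beta Y_1 + N) &\le \frac{1}{2}\log\frac{2\pi e(2h_1^2 + 2g_1^2 + 1)(1-\beta^2)}{\beta^2(2h_1^2 + 2g_1^2 + 1) + (1-\beta^2)} \\
&= \frac{1}{2}\log\frac{2\pi e(2h_1^2 + 2g_1^2 + 1)(1-\beta^2)}{2\beta^2(h_1^2+g_1^2) + 1}.
\end{aligned} \tag{31}$$

Substituting (31) into (29), we have

$$I(X,S;Y_1|Y_2) \le \frac{1}{2}\log\frac{2(h_1^2+g_1^2)+1}{2\beta^2(h_1^2+g_1^2)+1}. \tag{32}$$

Further substituting (28) and (32) into (9) establishes the upper
bound in (27).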

To show that the lower bound is always within half a bit of
the upper bound, let us define

$$\begin{aligned}
a &:= \frac{1}{2}\log(1 + h_1^2) \\
b &:= \frac{1}{2}\log\frac{2(h_1^2+g_1^2)+1}{2\beta^2(h_1^2+g_1^2)+1} \\
c &:= \frac{1}{2}\log\frac{(1+h_1^2+g_1^2)h_1^2}{h_1^2+g_1^2} \\
\text{and} \quad d &:= \frac{1}{2}\log\frac{1+h_1^2+g_1^2}{1+\beta^2(h_1^2+g_1^2)}.
\end{aligned}$$


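We shall consider the following two cases separately.
Case 1: $h_1^2 \le 1$. In this case,

$$a = \frac{1}{2}\log(1 + h_1^2) \le \frac{1}{2}$$

and the gap between the upper and the lower bound can be
bounded from above as

$$\min\{a, b\} - (\min\{c, d\})^+ \le \min\{a, b\} \le a \le \frac{1}{2}. \tag{33}$$

Case 2: $h_1^2 > 1$. In this case,

$$\begin{aligned}
a - c &= \frac{1}{2}\log(1 + h_1^2) - \frac{1}{2}\log\frac{(1+h_1^2+g_1^2)h_1^2}{h_1^2+g_1^2} \\
&\le \frac{1}{2}\log(1 + h_1^2) - \frac{1}{2}\log(h_1^2) \\
&= \frac{1}{2}\log\left(1 + \frac{1}{h_1^2}\right) \\
&\le \frac{1}{2}.
\end{aligned} \tag{34}$$
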



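Note that for any channel parameters $h_1$, $g_1$ and $\beta$,

$$\begin{aligned}
b - d &= \frac{1}{2}\log\frac{2(h_1^2+g_1^2)+1}{2\beta^2(h_1^2+g_1^2)+1} - \frac{1}{2}\log\frac{1+h_1^2+g_1^2}{1+\beta^2(h_1^2+g_1^2)} \\
&= \frac{1}{2}\log\left[\frac{2(h_1^2+g_1^2)+1}{1+h_1^2+g_1^2} \cdot \frac{1+\beta^2(h_1^2+g_1^2)}{2\beta^2(h_1^2+g_1^2)+1}\right] \\
&\le \frac{1}{2}\log\frac{2(h_1^2+g_1^2)+1}{1+h_1^2+g_1^2} \\
&\le \frac{1}{2}\log\frac{2(h_1^2+g_1^2)+2}{1+h_1^2+g_1^2} \\
&= \frac{1}{2}
\end{aligned} \tag{35}$$

and for any real scalars $a$, $b$, $c$ and $d$,

$$\begin{aligned}
\min\{a, b\} - (\min\{c, d\})^+ &\le \min\{a, b\} - \min\{c, d\} \\
&= \max\{\min\{a, b\} - c,\; \min\{a, b\} - d\} \\
&\le \max\{a - c,\; b - d\}.
\end{aligned} \tag{36}$$

Substituting (34) and (35) into (36), we have

$$\min\{a, b\} - (\min\{c, d\})^+ \le \max\left\{\frac{1}{2}, \frac{1}{2}\right\} = \frac{1}{2}. \tag{37}$$

Combining the above two cases proves that the lower bound
in (27) is always within half a bit of the upper bound. This
completes the proof of the theorem. ■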
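
The half-bit gap can also be observed numerically. The sketch below is an illustration only, assuming base-2 logarithms and a nonzero $h_1$; the function name `theorem3_bounds` and the parameter grid are assumptions. It evaluates the two sides of a bound of the form (27) from the quantities $a$, $b$, $c$ and $d$ defined in the proof and records the largest gap over the grid.

```python
import math
from itertools import product


def theorem3_bounds(h1, g1, beta):
    """Return (lower, upper) bounds on C_s in bits, using a, b, c, d."""
    p = h1 ** 2 + g1 ** 2
    a = 0.5 * math.log2(1.0 + h1 ** 2)
    b = 0.5 * math.log2((2.0 * p + 1.0) / (2.0 * beta ** 2 * p + 1.0))
    c = 0.5 * math.log2((1.0 + p) * h1 ** 2 / p)
    d = 0.5 * math.log2((1.0 + p) / (1.0 + beta ** 2 * p))
    return max(0.0, min(c, d)), min(a, b)


# Hypothetical grid of channel parameters (h1 nonzero, |beta| < 1).
h1_values = (0.05, 0.3, 1.0, 3.0, 30.0)
g1_values = (0.0, 0.5, 1.0, 5.0, 50.0)
beta_values = (0.0, 0.25, 0.5, 0.9, 0.99)

largest_gap = 0.0
for h1, g1, beta in product(h1_values, g1_values, beta_values):
    lower, upper = theorem3_bounds(h1, g1, beta)
    largest_gap = max(largest_gap, upper - lower)

print("largest gap on the grid (bits):", largest_gap)
```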



Finally, we note that the work [3] considered, as a heuristic
choice, the auxiliary variable

$$U = h_1 X + \alpha g_1 S \tag{38}$$

where $X$ is standard Gaussian and independent of $S$, and $\alpha$ is
chosen to maximize the achievable secrecy rate. A closed-form
expression for the maximizing $\alpha$, denoted $\alpha^*$, can be written as

if < h\ < h\ L 



o 



where 



P 2 hl{\ gi \ + ^hl+gl+l/ f}^) 
|ffi|(l+/3 2 ^) : 

1, 



if h\ L <h\< h\ H 

if h\ > h\ H 



h\ h 



9l ! , \9i\ I 2 ~ 4 ~ 



and 



9l 



Iffil 
2 



9! 



4 

2 ' 2 f 1 ' f 

Thus, for $h_1^2 \ge h_{1,H}^2$, the heuristic choice (38) with the
maximizing $\alpha$ coincides with the choice $U = h_1 X + g_1 S$
suggested by the linear deterministic model.

A numerical comparison between the achievable secrecy
rates for choosing $\alpha = \alpha^*$ and $\alpha = 1$ in (38) as well
as the upper bound in (27) is provided in Figure 2. As we
can see, when $h_1^2$ (which represents the received signal-to-noise
ratio at the legitimate receiver) is small, the choice
$\alpha = 1$ (as suggested by the linear deterministic model) can be
very suboptimal in maximizing the achievable secrecy rate.
However, in this case, the secrecy capacity of the channel
is also small, so the achievable secrecy rate given by the
suboptimal choice $\alpha = 1$ remains within half a bit of the
secrecy capacity. For small $h_1^2$, substantial improvement to
the achievable secrecy rate can be made by optimizing over
$\alpha$. In fact, when $h_1^2 \le h_{1,L}^2$, the achievable secrecy rate given
by $\alpha = \alpha^*$ coincides with the upper bound and hence gives
the exact secrecy capacity of the channel. When $h_1^2$ is large,
the maximizing $\alpha$ approaches 1 (it is exactly equal to 1 when
$h_1^2 \ge h_{1,H}^2$), and both choices lead to achievable secrecy rates
which are within half a bit of the secrecy capacity.

IV. Secret-Key Agreement via Dirty-Paper Coding 

A different but closely related communication scenario is
secret-key agreement via dirty-paper coding, which was first
considered in [15]. In this setting, the channel model is
exactly the same as that for secret writing on dirty paper.
The difference is in the goal of communication. For secret
writing on dirty paper, the goal is to convey to the legitimate
receiver a secret message $W$, which is pre-chosen and hence
is independent of the known interference $S^n$. For secret-key
agreement, the goal is to establish, between the transmitter
and the legitimate receiver, an agreement on a secret key $K$,
which must be kept asymptotically perfectly secret from the
eavesdropper, i.e.,

$$\frac{1}{n} I(K; Y_2^n) \to 0$$

in the limit as the block length $n \to \infty$. The secret-key capacity
$C_K$ is defined as the largest entropy rate $(1/n) H(K)$
that can be achieved by a coding scheme. Unlike the problem
of secret writing on dirty paper, the secret key $K$ can be
potentially correlated with the known interference $S^n$.
Hence, the secret-key capacity $C_K$ is at least as large as the
secrecy capacity $C_s$ for the same wiretap channel.

Fig. 2. A numerical comparison between the achievable secrecy rates for choosing $\alpha = \alpha^*$ and $\alpha = 1$ in (38). Both $\alpha^*$ and the achievable secrecy rate $R_s$
are plotted as a function of $h_1^2$ (in dB), while $g_1$ and $\beta$ are fixed to be 1 and 0.5, respectively.

For a general discrete memoryless wiretap channel with
side information, the secret-key capacity $C_K$ is unknown. The
following lower and upper bounds were established in [15]:

$$\max_{p(u,x|s)} \left[ I(U;Y_1) - I(U;Y_2) \right] \le C_K \le \max_{p(x|s)} I(X,S;Y_1|Y_2) \tag{39}$$

where $U$ is an auxiliary variable satisfying the Markov chain
$U \to (X, S) \to (Y_1, Y_2)$ and such that

$$I(U;Y_1) - I(U;S) \ge 0. \tag{40}$$

For semi-deterministic wiretap channels where the received
signal $Y_1$ at the legitimate receiver is a deterministic bivariate
function of the channel input $X$ and state $S$, the lower bound
in (39) with the choice of auxiliary variable $U = Y_1$ coincides
with the upper bound, giving an exact characterization of the
secret-key capacity

$$C_K = \max_{p(x|s)} H(Y_1|Y_2). \tag{41}$$

Note here that the choice $U = Y_1$ always satisfies the
constraint (40).

For the linear deterministic wiretap channel (11) with side
information, by Lemma 1 the conditional entropy $H(Y_1|Y_2)$
is maximized when the input variable $X$ is an i.i.d. Bernoulli-1/2
random vector and independent of $S$. By the equivalence relationship (5)
between the linear deterministic and the Gaussian model, this
suggests the following choice of auxiliary variable $U$ for the
degraded Gaussian model (26):

$$U = h_1 X + g_1 S \tag{42}$$

where $X$ is standard Gaussian and independent of $S$, as
long as (40) is satisfied. Substituting (42), (21)-(23), and the
degradedness assumption (25) into (39) and (40), we have the
following lower and upper bounds on the secret-key capacity
$C_K$ of the degraded Gaussian model

$$\frac{1}{2}\log\frac{1+h_1^2+g_1^2}{1+\beta^2(h_1^2+g_1^2)} \le C_K \le \frac{1}{2}\log\frac{2(h_1^2+g_1^2)+1}{2\beta^2(h_1^2+g_1^2)+1} \tag{43}$$

for all channel coefficients $h_1$ and $g_1$ such that (see footnote 1)

$$(1+h_1^2+g_1^2)h_1^2 \ge h_1^2+g_1^2, \quad \text{i.e.,} \quad h_1^2 \ge h_{1,T}^2 := \frac{g_1^2}{2}\left(\sqrt{1+\frac{4}{g_1^2}} - 1\right). \tag{44}$$

By (35), the lower bound in (43) is always within half a bit
of the upper bound.

We mention here that [15] also considered, as a heuristic
choice, the auxiliary variable $U$ of form (42) where $X$ is
standard Gaussian. However, instead of choosing $X$ to be
independent of $S$ as suggested by the linear deterministic
model, [15] considered $X$ which is correlated with $S$ and with
correlation coefficient $\rho = E[XS] = \rho^*$, where



h\ + gl 



{l + h\+gl)h\ 



sgn{h\gi). 



(45) 



Footnote 1: The upper bound is valid for all channel parameters. Due to the constraint
(40), when $h_1^2 < h_{1,T}^2$ the linear deterministic model does not appear to
provide any insight on how to choose the auxiliary variable $U$ for the degraded
Gaussian model.

Fig. 3. A numerical comparison between the achievable secret-key rates for
choosing $\rho = \rho^*$ and $\rho = 0$ in (42). The achievable secret-key rates $R_K$
are plotted as a function of $h_1^2$, while $g_1$ and $\beta$ are fixed to be 1 and 0.5,
respectively.



Here, sgn($x$) denotes the sign of the real scalar $x$. It is straightforward
to verify that condition (44) guarantees the existence of
$\rho^*$. A numerical comparison between the achievable secret-key
rates for choosing the correlation coefficient $\rho = \rho^*$ and $\rho = 0$
as well as the upper bound in (43) is provided in Figure 3. As
we can see, even though the choice $\rho = 0$ is suboptimal in
maximizing the achievable secret-key rate, both choices lead
to achievable secret-key rates that are within half a bit of the
secret-key capacity for $h_1^2 \ge h_{1,T}^2$.

V. Concluding Remarks 

In this paper, we took a deterministic view and revisited the 
problem of wiretap channel with side information. A precise 
characterization of the secrecy capacity was obtained for a 
linear deterministic model, which naturally suggests a coding 
scheme which we showed to achieve the secrecy capacity of 
the degraded Gaussian model (dubbed as "secret writing on 
dirty paper") to within half a bit. 

This paper falls in the line of using the linear deterministic 
model to provide approximate characterization of Gaussian 
network capacity, an approach which has become increasingly 
popular in information theory literature. However, our method 
is somewhat different from most of the practices along this 
line of research. In the literature, a common practice has been
to first gain "insight" from the capacity-achieving scheme for 
the linear deterministic model and then translate the success 
to the Gaussian model at the scheme level. To the best 
of our understanding, such translations are more art than 
science. For the problems that we considered in this paper, 
the translation of success from the linear deterministic model 
to the Gaussian model was done at the level of a single-letter 
description of channel capacity and hence was much more 
systematic. Our ongoing work aims at understanding to what 
extent this method can be applied to more complex network 
communication scenarios. 



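Appendix A
Proof of Proposition 1

By Fano's inequality, any achievable secrecy rate $R_s$ must
satisfy

$$\begin{aligned}
n(R_s - \epsilon_n) &\le I(W; Y_1^n) \\
&\le I(W; Y_1^n, S^n) \\
&= I(W; Y_1^n | S^n) \\
&\le I(X^n; Y_1^n | S^n) \\
&= H(Y_1^n | S^n) - H(Y_1^n | X^n, S^n) \\
&= H(Y_1^n | S^n) - \sum_{i=1}^{n} H(Y_1[i] | X[i], S[i]) \\
&\le \sum_{i=1}^{n} H(Y_1[i] | S[i]) - \sum_{i=1}^{n} H(Y_1[i] | X[i], S[i]) \\
&= n\left[ H(Y_{1,Q} | S_Q, Q) - H(Y_{1,Q} | X_Q, S_Q, Q) \right] \\
&= n\left[ H(Y_{1,Q} | S_Q, Q) - H(Y_{1,Q} | X_Q, S_Q) \right] \\
&\le n\left[ H(Y_{1,Q} | S_Q) - H(Y_{1,Q} | X_Q, S_Q) \right] \\
&= n \cdot I(X_Q; Y_{1,Q} | S_Q)
\end{aligned}$$

where $\epsilon_n \to 0$ in the limit as $n \to \infty$, and $Q$ is a standard
time-sharing variable.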

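Similarly, for any achievable secrecy rate $R_s$ we have 

$$
\begin{aligned}
n(R_s - \epsilon_n) &\le I(W; Y_1^n) - I(W; Y_2^n) \\
&\le I(W; Y_1^n, Y_2^n) - I(W; Y_2^n) \\
&= I(W; Y_1^n \mid Y_2^n) \\
&\le I(X^n, S^n; Y_1^n \mid Y_2^n) \\
&= H(Y_1^n \mid Y_2^n) - H(Y_1^n \mid X^n, S^n, Y_2^n) \\
&= H(Y_1^n \mid Y_2^n) - \sum_{i=1}^{n} H(Y_1[i] \mid X[i], S[i], Y_2[i]) \\
&\le \sum_{i=1}^{n} H(Y_1[i] \mid Y_2[i]) - \sum_{i=1}^{n} H(Y_1[i] \mid X[i], S[i], Y_2[i]) \\
&= n\left[H(Y_{1,Q} \mid Y_{2,Q}, Q) - H(Y_{1,Q} \mid X_Q, S_Q, Y_{2,Q}, Q)\right] \\
&= n\left[H(Y_{1,Q} \mid Y_{2,Q}, Q) - H(Y_{1,Q} \mid X_Q, S_Q, Y_{2,Q})\right] \\
&\le n\left[H(Y_{1,Q} \mid Y_{2,Q}) - H(Y_{1,Q} \mid X_Q, S_Q, Y_{2,Q})\right] \\
&= n \cdot I(X_Q, S_Q; Y_{1,Q} \mid Y_{2,Q}).
\end{aligned}
$$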

Note that the channel states are memoryless, so $S_Q$ has 
the same distribution as $S[i]$ for any $i = 1, \ldots, n$. The 
channel is also memoryless, so the conditional distribution of 
$(Y_{1,Q}, Y_{2,Q})$ given $(X_Q, S_Q)$ is given by the channel transition 
probability $p(y_1, y_2 \mid x, s)$. Letting $X_Q = X$, $S_Q = S$, 
$Y_{1,Q} = Y_1$, $Y_{2,Q} = Y_2$, and $n \to \infty$ completes the proof of 
the proposition. 



Appendix B 
Proof of Lemma 1 

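Let $Z$ be an i.i.d. Bernoulli-1/2 vector. We have 

$$
H(AZ \mid BZ) = H\left(\begin{bmatrix} A \\ B \end{bmatrix} Z\right) - H(BZ)
= \operatorname{rank}\begin{bmatrix} A \\ B \end{bmatrix} - \operatorname{rank}(B).
$$

We thus conclude that 

$$
\max H(AZ \mid BZ) \ge \operatorname{rank}\begin{bmatrix} A \\ B \end{bmatrix} - \operatorname{rank}(B). \tag{46}
$$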



To prove the reverse inequality, let us consider the null space 
of $B$ and its coset partition based on the null space of 
$\begin{bmatrix} A \\ B \end{bmatrix}$. 
Fix $BZ = b$. Then, any solution $Z$ can be written as the sum 
of a particular solution $Z_p$ and a vector $Z_h$ in the null space 
of $B$. Note that all vectors $Z_h$ in the same coset of the null 
space of $B$ relative to the null space of 
$\begin{bmatrix} A \\ B \end{bmatrix}$ 
give the same value for $AZ_h$. Thus, the number of different 
values that $AZ$ can take for any given value of $b$ equals the 
number of cosets in the null space of $B$, which is given by 

$$
2^{\operatorname{nullity}(B) - \operatorname{nullity}\begin{bmatrix} A \\ B \end{bmatrix}}
= 2^{\operatorname{rank}\begin{bmatrix} A \\ B \end{bmatrix} - \operatorname{rank}(B)}.
$$

We thus conclude that 

$$
\max H(AZ \mid BZ) \le \operatorname{rank}\begin{bmatrix} A \\ B \end{bmatrix} - \operatorname{rank}(B). \tag{47}
$$

Combining (46) and (47) completes the proof of the lemma. 
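
As a sanity check on the lemma, the following Python sketch enumerates every $Z$ in GF(2)^k for small matrices and compares two quantities with the rank difference: the conditional entropy $H(AZ \mid BZ)$ under a uniform $Z$, and the number of distinct values of $AZ$ for each fixed $BZ = b$. The function names and the example matrices `A_EX` and `B_EX` are hypothetical choices made for illustration; they do not come from the paper.

```python
import itertools
import math
from collections import Counter, defaultdict


def gf2_rank(rows):
    """Rank over GF(2) of a matrix given as a list of 0/1 rows."""
    rows = [list(r) for r in rows]
    rank = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        # Find a pivot row at or below the current rank position.
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        # Clear this column in every other row (addition is XOR over GF(2)).
        for r in range(len(rows)):
            if r != rank and rows[r][col]:
                rows[r] = [a ^ b for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank


def matvec(M, z):
    """Matrix-vector product over GF(2), returned as a tuple."""
    return tuple(sum(a & b for a, b in zip(row, z)) % 2 for row in M)


def cond_entropy_uniform(A, B, k):
    """H(AZ | BZ) in bits for Z uniform on GF(2)^k, by exhaustive enumeration."""
    joint = Counter()
    for z in itertools.product((0, 1), repeat=k):
        joint[(matvec(A, z), matvec(B, z))] += 1
    marginal = Counter()
    for (_, b), count in joint.items():
        marginal[b] += count
    total = 2 ** k
    return -sum(c / total * math.log2(c / marginal[b])
                for (_, b), c in joint.items())


def distinct_az_per_b(A, B, k):
    """Map each reachable b = BZ to the number of distinct values of AZ."""
    seen = defaultdict(set)
    for z in itertools.product((0, 1), repeat=k):
        seen[matvec(B, z)].add(matvec(A, z))
    return {b: len(values) for b, values in seen.items()}


# Hypothetical example: k = 3, A is 2 x 3, B is 1 x 3.
A_EX = [[1, 0, 1], [0, 1, 1]]
B_EX = [[1, 1, 0]]

if __name__ == "__main__":
    diff = gf2_rank(A_EX + B_EX) - gf2_rank(B_EX)
    print("rank difference:", diff)
    print("H(AZ|BZ), uniform Z:", cond_entropy_uniform(A_EX, B_EX, 3))
    print("distinct AZ per b:", distinct_az_per_b(A_EX, B_EX, 3))
```

The entropy under a uniform input illustrates the lower bound (46), while the per-`b` count illustrates the counting argument behind (47). Exhaustive enumeration is only practical for small `k`.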

Appendix C 
Proof of Theorem 2 

The matrix $B$ is a horizontal stack of two down-shift 
matrices with ranks $n_2$ and $m_2$, respectively. Since both 
sub-matrices are in reduced row-echelon form, it suffices to count 
the number of nonzero rows of $B$ to find its rank: 

$$
\operatorname{rank}(B) = q - \min\{q - n_2, q - m_2\} = \max\{n_2, m_2\}.
$$

The matrix 

$$
G := \begin{bmatrix} A \\ B \end{bmatrix}
= \begin{bmatrix} D^{q-n_1} & D^{q-m_1} \\ D^{q-n_2} & D^{q-m_2} \end{bmatrix}
$$

is formed by vertically stacking the two matrices $A$ and $B$. Thus, 
evaluating the rank of $G$ is equivalent to counting the number 
of zero rows along with the number of redundant nonzero rows 
between $A$ and $B$ (denoted by $d_{AB}$): 

$$
\begin{aligned}
\operatorname{rank}(G) &= 2q - \min\{q - n_1, q - m_1\} - \min\{q - n_2, q - m_2\} - d_{AB} \\
&= \max\{n_1, m_1\} + \max\{n_2, m_2\} - d_{AB}.
\end{aligned}
$$

To calculate $d_{AB}$, let us consider the following five cases 
separately: 

Case 1: Either $n_1 < m_1$ and $n_2 > m_2$, or $n_2 < m_2$ and 
$n_1 > m_1$. In this case, all nonzero rows of $G$ are independent, 
so $d_{AB} = 0$. By Proposition 2, 

$$
C_s = \min\{n_1, \max\{n_1, m_1\} - 0\} = n_1.
$$

Case 2: $n_1 < m_1$ and $n_2 < m_2$, but $m_1 - n_1 \ne m_2 - n_2$. 
In this case, the redundant nonzero rows of $G$ are given by 
the redundant rows between the top $m_1 - n_1$ nonzero rows of 
$A$ and the top $m_2 - n_2$ nonzero rows of $B$. Hence, $d_{AB} = 
\min\{m_1 - n_1, m_2 - n_2\}$. By Proposition 2, 

$$
\begin{aligned}
C_s &= \min\{n_1, m_1 - \min\{m_1 - n_1, m_2 - n_2\}\} \\
&= \min\{n_1, \max\{n_1, m_1 - m_2 + n_2\}\} \\
&= n_1.
\end{aligned}
$$

Case 3: $n_1 > m_1$ and $n_2 > m_2$, but $m_1 - n_1 \ne m_2 - n_2$. 
In this case, the redundant nonzero rows of $G$ are given by 
the redundant rows between the top $n_1 - m_1$ nonzero rows of 
$A$ and the top $n_2 - m_2$ nonzero rows of $B$. Hence, $d_{AB} = 
\min\{n_1 - m_1, n_2 - m_2\}$. By Proposition 2, 

$$
\begin{aligned}
C_s &= \min\{n_1, n_1 - \min\{n_1 - m_1, n_2 - m_2\}\} \\
&= n_1 - \min\{n_1 - m_1, n_2 - m_2\} \\
&= \max\{m_1, n_1 - n_2 + m_2\}.
\end{aligned}
$$

Case 4: $n_1 - m_1 = n_2 - m_2$ and $n_1 > m_1$. In this case, 
the redundant nonzero rows of $G$ correspond to the redundant 
nonzero rows of 

$$
\begin{bmatrix} D^{q-n_1} \\ D^{q-n_2} \end{bmatrix}
$$

so $d_{AB} = \min\{n_1, n_2\}$. By Proposition 2, 

$$
\begin{aligned}
C_s &= \min\{n_1, n_1 - \min\{n_1, n_2\}\} \\
&= n_1 - \min\{n_1, n_2\} \\
&= (n_1 - n_2)^+.
\end{aligned}
$$

Case 5: $n_1 - m_1 = n_2 - m_2$ and $n_1 < m_1$. In this case, 
the redundant nonzero rows of $G$ correspond to the redundant 
nonzero rows of 

$$
\begin{bmatrix} D^{q-m_1} \\ D^{q-m_2} \end{bmatrix}
$$

so $d_{AB} = \min\{m_1, m_2\}$. By Proposition 2, 

$$
\begin{aligned}
C_s &= \min\{n_1, m_1 - \min\{m_1, m_2\}\} \\
&= \min\{n_1, (m_1 - m_2)^+\} \\
&= \min\{n_1, (n_1 - n_2)^+\} \\
&= (n_1 - n_2)^+.
\end{aligned}
$$

Combining the results from the above five cases completes 
the proof of the claimed expression for $C_s$ and hence of Theorem 2. 
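
The case analysis can be cross-checked numerically for small $q$. The Python sketch below builds $A$ and $B$ from powers of the down-shift matrix, computes ranks over GF(2), and compares $\min\{n_1, \operatorname{rank}(G) - \operatorname{rank}(B)\}$ with the per-case closed forms. Two assumptions are made here: that Proposition 2 has the form $C_s = \min\{n_1, \operatorname{rank}(G) - \operatorname{rank}(B)\}$, which is inferred from how the cases use it, and that boundary ties not listed in the five cases (for example $n_1 = m_1$ with $n_2 \ne m_2$) fall into the $C_s = n_1$ branch. All function names are hypothetical.

```python
import itertools


def gf2_rank(rows):
    """Rank over GF(2) of a matrix given as a list of 0/1 rows."""
    rows = [list(r) for r in rows]
    rank = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(len(rows)):
            if r != rank and rows[r][col]:
                rows[r] = [a ^ b for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank


def shift_power(q, k):
    """D^k, where D is the q x q down-shift matrix (ones on the subdiagonal)."""
    return [[1 if i - j == k else 0 for j in range(q)] for i in range(q)]


def ranks(q, n1, m1, n2, m2):
    """Return (rank(A), rank(B), rank(G)) for the stacked shift matrices."""
    A = [x + s for x, s in zip(shift_power(q, q - n1), shift_power(q, q - m1))]
    B = [x + s for x, s in zip(shift_power(q, q - n2), shift_power(q, q - m2))]
    return gf2_rank(A), gf2_rank(B), gf2_rank(A + B)


def cs_from_ranks(q, n1, m1, n2, m2):
    """Assumed form of Proposition 2: min{n1, rank(G) - rank(B)}."""
    _, rank_b, rank_g = ranks(q, n1, m1, n2, m2)
    return min(n1, rank_g - rank_b)


def cs_closed_form(n1, m1, n2, m2):
    """Closed forms collected from the five cases of the proof."""
    if n1 - m1 == n2 - m2:            # Cases 4 and 5
        return max(n1 - n2, 0)
    if n1 > m1 and n2 > m2:           # Case 3
        return max(m1, n1 - n2 + m2)
    return n1                         # Cases 1 and 2, plus boundary ties (assumed)


def check_all(q_max):
    """List every parameter tuple where the rank-based and closed forms differ."""
    mismatches = []
    for q in range(1, q_max + 1):
        for n1, m1, n2, m2 in itertools.product(range(q + 1), repeat=4):
            if cs_from_ranks(q, n1, m1, n2, m2) != cs_closed_form(n1, m1, n2, m2):
                mismatches.append((q, n1, m1, n2, m2))
    return mismatches


if __name__ == "__main__":
    print("mismatches for q <= 4:", check_all(4))
```

Exhaustive checking over all $0 \le n_1, m_1, n_2, m_2 \le q$ is cheap for small $q$ and also exercises the boundary ties, which the strict inequalities in the case statements leave implicit.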